Thakkar, Amit
- Learning Using Heterogeneous Classifier in Data Mining
Authors
Affiliations
1 Chandubhai S Patel Institute of Technology, Changa, Gujarat, IN
2 Chandubhai S Patel Institute of Technology, Changa, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 13 (2011), Pagination: 788-792
Abstract
Data mining can be considered an analytic process designed to explore business or market data in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. Data mining is useful for prediction. We can improve the accuracy of different classifiers by combining various classifiers and taking their predictions. One such method is Stacking, an ensemble method in which a number of base classifiers are combined using one meta-classifier that learns from their outputs. This enhances the benefits obtained from the individual classifiers. This paper reviews the different approaches proposed by various authors in their papers.
Keywords
Ensemble of Classifiers, Bagging, Boosting, Stacking, Troika.
- Improved K-Means with Dimensionality Reduction Technique
Authors
Affiliations
1 Charotar Institute of Technology, Changa, Nadiad, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 12 (2011), Pagination: 722-725
Abstract
Clustering is the process of finding groups of objects such that the objects in a group are similar to one another and different from the objects in other groups. K-means is a well-known partitioning-based clustering technique that attempts to find a user-specified number of clusters, each represented by its centroid. The K-means clustering algorithm often does not work well for high-dimensional data; hence, to improve efficiency, we apply PCA, a dimensionality reduction technique, to the data set and obtain a reduced data set containing possibly uncorrelated variables. The challenging task for any clustering method is to determine the number of clusters beforehand. To find the number of clusters, we apply the EM method, which suggests the number of clusters the user should choose by determining a mixture of Gaussians that fits the given data set. Finally, the experimental results show that using techniques such as PCA and EM improves the efficiency of K-means clustering.
Keywords
Cluster, EM, K-Means, PCA.
- Comprehensive and Evolution Study Focusing Future Research Challenges in the Field of Multi Relational Data Mining Specific to Multi-Relational Classification Approaches
Authors
Amit Thakkar
1,
Y. P. Kosta
2
Affiliations
1 Chandubhai S. Patel Institute of Technology, Changa, Gujarat, IN
2 Marwadi Group of Institutions, Rajkot, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 10 (2011), Pagination: 594-598
Abstract
Most of today’s structured data is stored in relational databases. Thus, the task of learning from relational data has begun to receive significant attention in the literature. Unfortunately, most methods only utilize “flat” data representations. Thus, to apply these single-table data mining techniques, we are forced to incur a computational penalty by first converting the data into this “flat” form. As a result of this transformation, the data not only loses its compact representation, but the semantic information present in the relations is also reduced or eliminated. As an important task of multi-relational data mining, multi-relational classification can directly look for patterns that involve multiple relations in a relational database and has more advantages than propositional data mining approaches. According to the differences in knowledge representation and strategy, the paper addresses different kinds of multi-relational classification approaches, namely ILP-based, graph-based and relational database-based approaches, and discusses in detail each relational classification technology, its characteristics, the comparisons between them, and several challenging research problems.
Keywords
Multi-Relational Data Mining, Multi-Relational Classification, Inductive Logic Programming (ILP), Graph, Selection Graph, Tuple ID Propagation.
- Classification using Generalization Based Decision Tree Induction along with Relevance Analysis Based on Relational Database
Authors
Affiliations
1 Charotar Institute of Technology, Changa, Gujarat, IN
2 Charotar Institute of Technology, Changa, Gujarat, IN
Source
Data Mining and Knowledge Engineering, Vol 2, No 10 (2010), Pagination: 287-293
Abstract
Classification is a process of sorting unknown values of certain attributes of interest based on the values of other attributes, and is a major challenge in data mining. A commonly used method is the decision tree. The efficiency of decision tree algorithms has been well established for relatively small data sets. However, this method of classification has problems when handling larger data sets and data with continuous numerical values, and it tends to favor attributes with many distinct values when selecting the final determining attribute. In data mining applications, large training sets are common; therefore, decision tree algorithms have scalability limitations. Also, in most data mining applications, users have little knowledge of which signature attribute should be selected for effective mining, and depend more on the capability of the algorithm. In this paper, we address two things: first, selecting the right signature attribute, and second, handling large data sets. We accomplish this by proposing a new data classification method that integrates a sequence of steps: data cleaning, attribute-oriented induction (identifying the signature attribute), and relevance analysis as preprocessing, followed by decision tree induction. This stepwise approach helps us to set simple extraction rules at multiple levels of abstraction and easily handles large data sets and continuous numerical values in a scalable way.
Keywords
Data Mining, Classification, Data Cleaning, Decision Tree Induction, Relevance Analysis.
- An Improved Expectation Maximization based Semi-Supervised Text Classification using Naïve Bayes and Support Vector Machine
Authors
Affiliations
1 Department of Computer Engineering, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN
2 Department of Information and Technology, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN
3 U & PU Patel Department of Computer Engineering, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN
Source
Artificial Intelligent Systems and Machine Learning, Vol 4, No 5 (2012), Pagination: 330-335
Abstract
With the development of the Internet and the emergence of a large number of text resources, automatic text classification has become a research hotspot. As the number of training documents increases, the accuracy of text classification increases. Traditional classifiers (supervised learning) use only labeled data for training. Labeled instances are often difficult, expensive, or time-consuming to obtain. Meanwhile, unlabeled data may be relatively easy to collect. Semi-supervised learning makes use of both labeled and unlabeled data. Several researchers have proposed algorithms for text classification using semi-supervised learning. However, improving the accuracy of text classification using semi-supervised learning remains a challenge. In the iterative process of standard Expectation Maximization (EM) based semi-supervised learning, some unlabeled samples are misclassified by the current classifier because the initial labeled samples are not sufficient. To overcome this limitation, this paper proposes an EM-based semi-supervised learning algorithm using Naïve Bayes and Support Vector Machine to improve the accuracy of text classification.
Keywords
Expectation Maximization (EM), Naïve Bayes (NB), Support Vector Machine (SVM), Semi-Supervised Learning (SSL).
- A Novel Approach for Making Recommendation using Skyline Query based on user Location and Preference
Authors
Affiliations
1 Department of Information Technology, CSPIT, CHARUSAT, Anand - 388421, Gujarat, IN
Source
Indian Journal of Science and Technology, Vol 9, No 30 (2016), Pagination:
Abstract
Objectives: To propose a method to handle a large number of users and to improve the accuracy and quality of a recommendation system. Methods/Statistical Analysis: This paper presents an effective method to identify user location based on his/her preference using a Skyline query based on dominated objects. Dominance means that an object is as good or better in all dimensions and better in at least one dimension. The use of Skyline queries in recommendation systems has increased in recent years. Such systems mainly use location-based services to find the nearest location based on user preference. Location-based services are information services and have a number of uses in social networking. A location-based service finds the nearest location based on user preferences but does not provide locations based on similarity and rating, so the user is not satisfied with the result. Findings: To resolve this problem, we use the collaborative filtering technique, the K-nearest neighbor algorithm and a ranking scheme. Using collaborative filtering, we find the similarity and rating of an item. The K-nearest neighbor approach finds the nearest distance of similar items, and the ranking technique chooses the nearest location. In this paper we take a temporary dataset and mathematically evaluate our proposed system. Application/Improvements: In future, we will develop a web tool that identifies locations and displays results on a map. We will also check users' past movement history using a content-based recommendation system. Skyline queries in recommendation systems are used in various domains, e.g. house rental/buying and the travel and tourism business.
Keywords
Collaborative Filtering Technique, Dominated Object, K-Nearest Neighbor, Recommendation System, Skyline Query.
- Education Data Mining, Visualization and Sentiment Analysis of Coursera Course Review
Authors
Dhaval Bhoi
1,
Amit Thakkar
2
Affiliations
1 U & P U. Patel Department of Computer Engineering, IN
2 Department of Computer Science & Engineering, Chandubhai S. Patel Institute of Technology, Faculty of Technology & Engineering, Charotar University of Science and Technology, Changa – 388421, Gujarat, IN
Source
Journal of Engineering Education Transformations, Vol 36, No 2 (2022), Pagination: 169-177
Abstract
Objective: No decisions are good or bad; they are taken based on the available data. It is essential to present the data in the right form, to the right people, at the right time. Higher Engineering Institutes (HEIs) have a plethora of information available to them. Most of the available data is not used properly and remains dead storage. Methods: In this study, we show the importance of data visualization using a case study on the Coursera review dataset. Different useful tools that support improving an education system are summarized. Sentiment analysis is performed on the Coursera course review dataset using a deep learning method. Finally, a dashboard is created to visualize student data using the Power BI tool. Results: The use of different visualization tools can help to improve the education system and its performance. The sentiment expressed by students will help to improve the teaching-learning process and research contribution significantly, as these are the major components of evaluation when any HEI wants to receive NAAC [National Assessment and Accreditation Council] approval for the benefit of all stakeholders of the HEI. Conclusions: Proper analysis of the available data and its proper visualization can help us to improve the education system to a great extent, in terms of the most important factors such as student teaching-learning and placement, to make students' futures bright. Students' expressed sentiments are also key features for analyzing the success of the teaching-learning process for both teachers and students. We have also used our institute's student data to generate a dashboard that contains student information from different perspectives and can help higher authorities to make better decisions.
Keywords
Education Data Mining, Dashboard, Data Visualization, Sentiment.